Abstract

Background: DAPfinder and DAPview are novel BRB-ArrayTools plug-ins to construct gene coexpression networks 
and identify significant differences in pairwise gene-gene coexpression between two phenotypes. 

Results: Each significant difference in gene-gene association represents a Differentially Associated Pair (DAP). Our 
tools include several choices of filtering methods, gene-gene association metrics, statistical testing methods and 
multiple comparison adjustments. Network results are easily displayed in Cytoscape. Analyses of glioma 
experiments and microarray simulations demonstrate the utility of these tools. 

Conclusions: DAPfinder is a new user-friendly tool for reconstruction and comparison of biological networks.



Background 

Microarray researchers need easy-to-use tools to identify 
differences in the coexpression and coregulation of genes 
between phenotypes that cannot be identified with tradi- 
tional tools. Often researchers compute Student's t-tests, 
analysis of variance (ANOVA), significance analysis of 
microarrays [1] or empirical Bayes analysis [2] for each 
gene on their microarray to identify individual differen- 
tially expressed genes (DEGs) among two or more phe- 
notypes [3]. Unfortunately, these approaches ignore 
coexpression because they cannot account for the com- 
plex multivariate relationships among genes. Multivariate 
statistical methods like hierarchical clustering and principal
components analysis (PCA) are often used for quality
control and exploration of microarray data. However, 
these multivariate methods do not effectively model 
coexpression nor do they allow for hypothesis tests to 
compare phenotypes. Gene-gene association networks 
built using ARACNe [4], context likelihood of relatedness
(CLR) [5], maximum relevancy (MR) [6,7] and other
methods often provide helpful models of coexpression 
and coregulation, but the networks are based on data 
from a single phenotype and are not easily compared 
using statistical tests. New methods are needed to 
account for the complex relationships among genes while 
providing hypothesis tests to compare phenotypes. 

Several research groups have addressed the question of 
comparing the coexpression of specific gene-gene pairs 
or coexpression networks among two or more pheno- 
types. Two early examples used search algorithms to 
identify optimally sized clusters of coexpressed genes and 
resampling tests to identify significant differences among 
the coexpressed clusters between phenotypes [8,9]. Other 
published methods used variations on familiar statistical 
techniques like Fisher's Z tests or modified F-statistics to 
directly compare pairwise gene-gene correlations 
between two phenotypes [10-12]. Some of these methods
[10,11,13] are readily available as source scripts or
package libraries in R (http://www.r-project.org). Some
interesting approaches apply the results from statistical 
tests that compare pairwise gene-gene associations 
between two phenotypes to the construction and inter- 
pretation of gene coexpression networks [10,14]. Both of 
these methods allow researchers to explore the complex 
differences among gene expression networks using statis- 
tical tests, but unfortunately neither method has been 
implemented in a user-friendly tool. 

DAPfinder and DAPview are plug-ins for BRB-ArrayTools
(http://linus.nci.nih.gov/BRB-ArrayTools.html), which
will provide researchers with accessible tools to test differ- 
ences in the coexpression between two phenotypes and 
explore those results on gene association networks. BRB- 
ArrayTools is a comprehensive microarray analysis pack- 
age that does not require specific skills in programming or 
direct script usage. It is available for free to non-commer- 
cial users and has more than 11,000 users in 65 coun- 
tries [15]. Our DAPfinder and DAPview tools will identify 
and visualize individual significant differences in gene- 
gene association between the two classes, each of which 
we will call a Differentially Associated Pair (DAP). Output 
from these tools can be used to construct gene-gene asso- 
ciation networks and identify the significant differences in 
coexpression between two groups. Our hope is that these 
tools can be used to identify systems-level features in the 
gene-gene association networks like network growth or 
decay, network merging or splitting, and network birth or 
death, reflecting functional changes in biological pathways. 

Implementation 

DAPfinder 

DAPfinder is used to compute pair-wise gene-gene asso- 
ciations (i.e. gene-gene correlations) for two groups of 
microarray experiments, then compare each specific gene- 
gene association between the two groups with a statistical 
test (Additional file 1, Figure S1). Gene-gene associations
can be estimated using Pearson correlation coefficients, 
Spearman rank correlation coefficients, Kendall rank cor- 
relation coefficients or mutual information. Pearson corre- 
lations are the most familiar metric and the easiest to 
compute, but only the Spearman, Kendall and mutual 
information metrics are appropriate for nonlinear associa- 
tions between genes. Significant Pearson correlations 
within each class are identified using a one-sample Fisher's 
Z-test. Differences in gene-gene correlations (i.e. Pearson, 
Spearman and Kendall) are automatically tested using 
Fisher's Z-test methods, while optional permutation tests 
are used to compare differences in gene-gene correlation 
or mutual information. P-values from the Fisher's Z-test 
methods are approximate p-values that assume large sam- 
ple sizes; permutation tests make no assumption about 
sample size, but they require lengthy computation times. 
Permutation test calculations can be hastened by choosing 
from one of four gene-gene pair subset selection methods 
(Additional file 1, Figure S1). Tests can be computed with
equal numbers of permutations for each gene-gene pair or 
with an adaptive method that identifies the minimum 
number of permutations required for each gene-gene pair. 
Fisher's Z-tests of individual Pearson correlations within 
each class or differences in correlation between the 
two classes can be corrected for multiple testing using 
false discovery rate (FDR) methods [16,17], q-value
methods [18-20] or Bonferroni family-wise error rate
(FWER) methods using step-up adjusted p-values [21]. 
The same multiple testing adjustments can be applied to 
the optional permutation tests. Researchers can pre-filter 
individual genes by the coefficient of variation (CV) of 
their gene expression, by a minimum sample size criterion
(after outliers and missing data have been removed) or 
using the internal methods of BRB-ArrayTools. Research- 
ers can also upload a specific list of gene-gene pairs for 
testing. Outliers among the individual expression values 
from each gene can be removed using univariate standard 
deviation or interquartile range (IQR) criteria. 
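The Fisher's Z-test comparison and the FDR adjustment described above can be illustrated with a minimal Python sketch. This is not the DAPfinder R source. The function names are hypothetical, the standard error uses the usual large-sample approximation 1/(n - 3) for Pearson correlations, and the Benjamini-Hochberg step-up procedure [16] is shown as only one of the multiple testing adjustments mentioned. Correlations of exactly +1 or -1 are not handled.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def two_sided_normal_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def fisher_z_one_sample(r, n):
    """Test H0: rho = 0 within one class (assumes |r| < 1 and n > 3)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return z, two_sided_normal_p(z)


def fisher_z_two_sample(r1, n1, r2, n2):
    """Test H0: rho1 = rho2 between two classes (assumes |r| < 1, n > 3)."""
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (math.atanh(r1) - math.atanh(r2)) / se
    return z, two_sided_normal_p(z)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative values only: one gene-gene pair measured in two classes.
z_stat, p_value = fisher_z_two_sample(r1=0.9, n1=30, r2=-0.2, n2=30)
print(z_stat, p_value)
```

In practice one such test would be computed for every gene-gene pair that passes the pre-filters, and the resulting vector of p-values would then be passed to the adjustment function.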

Output from DAPfinder includes a hyper-text markup 
language (HTML) report and comprehensive output 
stored as an Excel spreadsheet or tab-delimited text file. 
The HTML report opens up automatically in a web 
browser to display the current user settings and diagnos- 
tics from the analyses. Reported user settings include 
choices of pre-filtering methods, association metrics and 
statistical tests, plus the directory location of the results. 
Diagnostics include the amount of missing data, the 
number of genes and gene-gene pairs used in the calcula- 
tions and the computation time required. Optionally, the 
10 most significant results from the Fisher's Z-tests and 
permutation test can be added to the HTML report. The 
comprehensive output includes the unique IDs and 
related annotations for both genes in each gene-gene 
pair, the individual gene-gene associations for each of the 
two groups with test statistics and p-values reported for 
the Pearson correlations in each group, the Fisher's 
Z-test statistics and p-values for comparisons between 
the two groups and finally the differences in association 
and permutation p-values between the two groups (if 
requested). These results can be sorted and reorganized 
in Excel to identify the most significant gene-gene asso- 
ciations in a single group, the most significant Fisher's 
Z-test results, etc. Results from the comprehensive 
output file can be directly imported into visualization 
software packages like Cytoscape ([22], http://www.cytoscape.org)
to create network graphs using the two columns
of unique IDs to define nodes and the columns of
correlation coefficients or p-values to define edge 
weights. Both the HTML report and the comprehensive 
output are automatically saved to the user's BRB-Array- 
Tools project folder. 

DAPview 

DAPview graphs the expression values for two specific
genes in an XY scatter plot with the differences in coexpression
between two phenotypes displayed in different
colors and symbols (Figure 1). Typically, a statistically
significant difference in gene-gene association would be
discovered using DAPfinder, then the relationship can
be visualized with DAPview. The two groups are
graphed using different colors and symbols with a figure
legend clearly identifying each group. Researchers can
choose to identify, eliminate or ignore the outlier
expression values identified by the same univariate standard
deviation or interquartile range (IQR) criteria from
DAPfinder. Identified outliers are plotted in red, while
eliminated outliers are completely removed from the
graph and ignored outliers are plotted in the same colors
as the legitimate data. Scatter plot graphs are automatically
opened in Portable Document Format (PDF)
and saved into the user's BRB-ArrayTools project
folder.

Figure 1 Negative correlation between the MYT1L gene (probe set
216672_at) and the SOX5 gene (probe set 207336_at) in
oligodendrogliomas (ODG) and no association between SOX5
and MYT1L in glioblastoma multiforme (GBM), illustrated using
DAPview.
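The univariate outlier criteria shared by DAPfinder and DAPview can be sketched as follows. This is an illustration only: the function names are hypothetical, and the cutoffs (a multiple `k` of the standard deviation, and the conventional 1.5 x IQR fences) are assumptions, since the thresholds used by the tools are not stated here.

```python
import statistics


def sd_outliers(values, k=3.0):
    """Flag values more than k sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [abs(v - mean) > k * sd for v in values]


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


# Hypothetical expression values for one gene; the last one is extreme.
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0]
flags = iqr_outliers(expression)

# "Eliminate" drops flagged values; "identify" would keep and highlight them.
cleaned = [v for v, is_out in zip(expression, flags) if not is_out]
print(cleaned)
```

Note that the extreme value inflates the standard deviation itself, so with the same data the standard deviation rule flags it at `k=2.0` but not at `k=3.0`, while the IQR rule is less sensitive to this masking effect.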

Results 

Evaluation of DAPfinder with Simulated Microarray Data 

The efficacy of the DAPfinder procedures was evaluated 
using simulated microarray data with known gene-gene 
correlations to ensure its statistical methods can detect 
known differences in gene-gene association with high 
levels of statistical power and low levels of false posi- 
tives. See the supplementary materials (Additional file 1) 
for details on the generation of simulated microarray 
data and other simulation methods. Simulation results 
were used to create receiver-operator characteristic 
(ROC) curves that explore the relationships between sta- 
tistical power, sample size and effect strength under sev- 
eral different simulation conditions. Other simulations 
examined the relationship between approximate p-values 
from the Fisher's Z-tests and exact p-values from the 
permutation tests. Simulations were conducted entirely
in R using the same R source code used to build
DAPfinder.

Examining changes in Area Under Curve (AUC) for the 
ROC quickly revealed many properties of the analyses in 
DAPfinder. Not surprisingly, results from the simulations 
show that sensitivity (i.e. statistical power) and specificity 
(i.e. control over false positives) increase as sample sizes 
($n$) or differences in correlation (delta = $\Delta r = r_i - r_j$)
increase (Figure 2) when all other experimental condi- 
tions are held constant (Additional file 1, supplementary 
information). These results show that DAPfinder performs
well even for relatively small sample sizes and differences
in correlation. Increasing the number of
genes on each microarray chip did not affect sensitivity 
or specificity (Additional file 1, Figure S6), supporting 
our decision to use simulations with a small number of 
genes per chip because they are more efficient (see Addi- 
tional file 1, supplementary materials, for details). In the 
real world, increasing the number of genes per chip 
would decrease statistical power due to the more conservative
FDR and FWER adjustments for multiple testing
and possibly due to higher-level interactions among large
numbers of genes. However, these simulations computed 
ROC curves using unadjusted p-values and fixed num- 
bers of interacting genes. It may be surprising that sensi- 
tivity and specificity did not change as the expression 
variances of individual genes increased (Additional file 1, 
Figure S7) with all other experimental conditions held 
constant. However, increasing the individual gene expres- 
sion variance does not affect sensitivity and specificity, 
because the correlation of two genes is a property of the 
joint distribution that is not solely dependent on the 
magnitude of individual gene expression variances. Perhaps
the most important simulation result showed that
sensitivity and specificity increased as the correlation coefficients
from the two groups changed from perfectly symmetric
with $r_i - r_j = +0.5 - (-0.5) = 1$ to increasingly
asymmetric coefficients like $r_i - r_j = +0.95 - (-0.05) = 1$ or
$r_i - r_j = +0.05 - (-0.95) = 1$ with all other conditions held
constant (Figure 3). Asymmetric correlation coefficients 
have more statistical power because the nonlinear Fish- 
er's Z-transformation used in the Fisher's Z-test inflates 
z-scores for strong correlations and deflates z-scores for 
moderate correlations (Additional file 1, Figure S8), 
creating larger differences in Z-scores and more signifi- 
cant Fisher's Z-tests. 
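The effect of asymmetry can be checked with a short worked calculation. The test statistic below is the standard two-sample Fisher's Z form, which we assume matches the one used in the simulations, and $n_i = n_j = 10$ is taken from the sample size used for Figure 3. Both pairs of coefficients differ by exactly 1 on the correlation scale, but not on the transformed scale:

```latex
\[
z(r) = \operatorname{artanh}(r) = \tfrac{1}{2}\ln\frac{1+r}{1-r},
\qquad
Z = \frac{z(r_i) - z(r_j)}{\sqrt{\dfrac{1}{n_i - 3} + \dfrac{1}{n_j - 3}}}
\]
\[
\text{symmetric: } z(0.5) - z(-0.5) \approx 0.549 + 0.549 = 1.099
\]
\[
\text{asymmetric: } z(0.95) - z(-0.05) \approx 1.832 + 0.050 = 1.882
\]
\[
n_i = n_j = 10:\quad
\sqrt{\tfrac{1}{7} + \tfrac{1}{7}} \approx 0.535,
\qquad
Z_{\text{symmetric}} \approx 2.06,
\qquad
Z_{\text{asymmetric}} \approx 3.52
\]
```

The same raw difference in correlation therefore yields a considerably larger test statistic when one of the two coefficients is close to +1 or -1.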

Skinner et al. BMC Bioinformatics 2011, 12:286
http://www.biomedcentral.com/1471-2105/12/286

Figure 2 Effects of sample size and difference in correlation. Left: Effect of increasing sample size on ROC AUC with 40 genes per chip,
250 simulation runs and constant delta = $\Delta r = r_i - r_j = +0.5 - (-0.5) = 1$. Right: Effect of increasing the difference in correlation between classes from
delta = $r_i - r_j = +0.55 - (-0.55) = 1.1$ to $r_i - r_j = +0.95 - (-0.95) = 1.9$ on ROC AUC with 40 genes per chip, 250 simulation runs and constant
sample size n = 5 chips per class.

Figure 3 Effect of increasing asymmetry on ROC AUC.
Asymmetry was increased from $r_i - r_j = +0.5 - (-0.5) = 1$ to $r_i - r_j = +0.95 - (-0.05) = 1$
with 40 genes per chip, 250 simulation runs and
constant sample size n = 10 chips per class.

Additional simulations showed that approximate p-values
from the Fisher's Z-test of differences in Pearson correlation
are strongly correlated with the exact p-values from
the permutation tests, and the correlation between the
approximate and exact p-values increases with sample
size (Additional file 1, Figure S9). Similar correlations
between approximate p-values and exact p-values are
seen for differences in Spearman rank correlation and
differences in Kendall rank correlations (Figure 4).
These results show that the robust results from computationally
intensive permutation and resampling tests can be very
closely approximated by the much faster Fisher's Z-test and
similar methods for Spearman and Kendall rank
correlations with reasonable sample sizes. Both options
are included to give researchers the choice of faster
computation when sample sizes are relatively large or
more robust results when sample sizes are smaller.
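For comparison with the approximate Fisher's Z-test, a permutation test for the difference in correlation can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the DAPfinder implementation: class labels are shuffled across the pooled samples while each sample keeps its two expression values together, a fixed number of random permutations is used (not the adaptive scheme), and the function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x1, y1, x2, y2, n_perm=1000, seed=0):
    """Monte Carlo permutation p-value for |r1 - r2| between two classes.

    x1, y1 hold the expression of two genes in class 1; x2, y2 in class 2.
    """
    rng = random.Random(seed)
    observed = abs(pearson(x1, y1) - pearson(x2, y2))
    # Pool the samples; each sample is an (x, y) pair that stays intact.
    pooled = list(zip(list(x1) + list(x2), list(y1) + list(y2)))
    n1 = len(x1)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign class labels at random
        a, b = pooled[:n1], pooled[n1:]
        r_a = pearson(*zip(*a))
        r_b = pearson(*zip(*b))
        if abs(r_a - r_b) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: positive association in class 1, negative in class 2.
x = [float(i) for i in range(12)]
noise = [0.3, -0.4, 0.1, 0.5, -0.2, 0.0, 0.4, -0.5, 0.2, -0.1, 0.3, -0.3]
class1_y = [a + e for a, e in zip(x, noise)]
class2_y = [-a + e for a, e in zip(x, noise)]
print(permutation_p_value(x, class1_y, x, class2_y))
```

Because every gene-gene pair needs its own set of permutations, the cost grows with both the number of pairs and `n_perm`, which is why the subset selection and adaptive options described earlier matter for large data sets.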

Discoveries from Glioma Data 

To illustrate some possible uses of the DAPfinder, we ana- 
lyzed transcriptional data from glioma samples publicly 
available in the Repository of Molecular Brain Neoplasia 
Data (REMBRANDT) ([23], http://caintegrator.nci.nih.gov/rembrandt).
We used data from oligodendroglioma
(ODG) and glioblastoma multiforme (GBM) samples 
representing low and high malignancy primary adult brain 
tumors, respectively [24]. We identified significant differences
in Pearson correlation (p < 0.10 and $r_i - r_j > 0.50$)
between ODG and GBM tumors for 727 gene-gene pairs 
which were consistent in tumors from two independent 
studies at Henry Ford Hospital [25] and the Glioma Mole- 
cular Diagnostics Initiative (GMDI) [26]. We constructed 
a gene-gene association network by focusing on a cluster 
of 27 gene-gene pairs (from 20 genes) with significant dif- 
ferences in Pearson correlation between ODG and GBM 
tumors and an additional 85 gene-gene pairs (from 56 
genes) that are connected to this cluster of 27 DAPs by 
correlations of similar strength and direction in both 
classes of gliomas (Figure 5). See supporting materials for 
details. 
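The selection of gene-gene pairs that are consistent across two independent studies can be sketched as a simple filter. Everything in this example is hypothetical: the gene labels and numbers are invented for illustration, the difference in correlation is compared in absolute value, and "consistent" is interpreted as passing both thresholds in both studies with the same direction of change. The exact criteria are described in the supporting materials.

```python
P_CUTOFF = 0.10      # p-value threshold quoted in the text
DELTA_CUTOFF = 0.50  # threshold on the difference in correlation


def delta_r(result):
    """Difference in correlation between the two tumor classes."""
    return result["r_odg"] - result["r_gbm"]


def is_dap(result):
    """True if one study's result meets both thresholds (assumed absolute)."""
    return result["p"] < P_CUTOFF and abs(delta_r(result)) > DELTA_CUTOFF


def consistent_daps(study_a, study_b):
    """Gene-gene pairs that are DAPs in both studies, in the same direction."""
    keep = []
    for pair, a in study_a.items():
        b = study_b.get(pair)
        if b is None:
            continue
        same_direction = delta_r(a) * delta_r(b) > 0
        if is_dap(a) and is_dap(b) and same_direction:
            keep.append(pair)
    return sorted(keep)


# Invented results keyed by (gene, gene) pair; not taken from the glioma data.
study_a = {
    ("GENE_A", "GENE_B"): {"r_odg": -0.80, "r_gbm": 0.05, "p": 0.01},
    ("GENE_A", "GENE_C"): {"r_odg": 0.70, "r_gbm": 0.60, "p": 0.40},
    ("GENE_B", "GENE_D"): {"r_odg": 0.75, "r_gbm": 0.10, "p": 0.03},
}
study_b = {
    ("GENE_A", "GENE_B"): {"r_odg": -0.70, "r_gbm": 0.00, "p": 0.04},
    ("GENE_A", "GENE_C"): {"r_odg": 0.65, "r_gbm": 0.55, "p": 0.60},
    ("GENE_B", "GENE_D"): {"r_odg": 0.10, "r_gbm": 0.70, "p": 0.05},
}
print(consistent_daps(study_a, study_b))
```

Here only the first pair survives: the second never reaches significance, and the third changes in opposite directions in the two studies.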

Figure 4 Effect of increasing sample size on the relationship between analytical (i.e. approximate) p-values from Fisher's Z-test
procedures and permutation (i.e. exact) p-values.

We noticed three features in the network that were not
necessarily expected. First, more than half of the genes
from this network were differentially expressed between
the two classes of glioma (46 out of 76 genes). This suggests
there may be a general correlation between differential 
expression and differences in association between pheno- 
types. Second, the relationship between differential 
expression and direction of correlation from consistent 
edges may represent potential regulatory relationships 
among genes. Positive correlations occur whenever both 
genes are up- or down-regulated, while negative correla- 
tions occur whenever one gene is up-regulated and the 
other is down-regulated. Note, because the correlations 
are estimated within the same type of samples, either 
ODG or GBM, the fact that genes are up- or down-regu- 
lated in GBM relative to ODG should not influence the 
correlation results. This phenomenon is seen in all 48 
correlations that are consistent between the ODG and 
GBM tumors. Third, the significant differences in gene- 
gene association seem to reflect the biological differences 
between ODG and GBM. Correlations that change direc- 
tion between glioma types typically show strong positive 
or negative correlations consistent with regulation in 
ODG, while having zero correlation in GBM. This sug- 
gests that evolution of the tumor may lead to the loss of 
regulatory relationships in the de-differentiating tissue. 
The gene-gene association network shrinks from 76 genes and
110 gene-gene pairs in ODG to 69 genes and 87 gene- 
gene pairs in GBM, suggesting systems-level network 
shrinkage from ODG to GBM resulting in loss of regula- 
tion functions. 

Among the significant correlation changes in the net- 
work, we find three genes (MYT1L, EGFR, POSTN) known 
to have meaningful roles in glioma pathogenesis [27-29].
Myelin transcription factor 1-like (MYT1L) is upregulated in
the less malignant ODG tumors and it is a major factor
necessary for neuronal differentiation [30]. The significant
difference in Pearson correlation between SOX5 and
MYT1L in ODG and GBM tumors is visualized with DAPview
(Figure 1). Epidermal growth factor receptor (EGFR)
is a well-known member of the erbB family of receptors that
is involved in the regulation of cell proliferation and differentiation.
Deregulation of EGFR was shown to have a critical role
in gliomas [31] as well as in several other malignancies
[32-36]. Up-regulation of the protein-coding gene
POSTN (periostin) is correlated with metastasis in both
melanoma and breast cancer [37]. Although this analysis
does not allow for definitive biological conclusions, it both
recovers previously established genes essential for tumorigenesis
and points to a previously unexplored area of
transcriptional regulation of gliomas. These results support
the idea that estimating not only the structure but also 
changes in the co-expression gene networks can be a useful 
approach for understanding the disease process. 

Conclusions 

Analyses of empirical and simulated microarray data have 
shown that DAPfinder is a powerful tool to reconstruct 
and compare gene regulatory networks. Its design is not 
restricted to gene expression data from single channel 
and dual channel microarray experiments. The tool can 
also be used with expression data from RNA-Seq reads 
or it can analyze complex quantitative biological data like 
comparative genomic hybridization (CGH), metabolome, 
microbiome and proteome data. DAPfinder can also be
used to compute gene-gene associations and construct
gene coexpression networks, even when there is not a
second phenotype for comparisons of gene-gene associations
and networks. DAPfinder can be used within BRB-ArrayTools
by biologists without specific skills in
programming and/or direct script usage. Indeed, we have
recently employed the tool in the meta-analysis of cervical
cancer gene expression and comparative genomic
hybridization data revealing critical events of tumor progression
(Mine KL, Shulzhenko N, Yambartsev A, et al.:
Reconstruction of an integrative gene regulatory meta-network
reveals cell cycle and antiviral response as major
drivers of cervical cancer, submitted). Future versions
may extend the utility of the statistical tests and graphs
to problems with 3 or more phenotypes, while alternative
gene-gene association metrics and statistical tests can
also be explored to ensure proper network construction.

Yellow nodes indicate the gene is involved in a significant change
in gene-gene correlation between two classes (i.e. DAP).
Gray nodes indicate that no differences in gene-gene correlation
between two classes are significant for the gene (i.e. no DAPs).
Red borders indicate significant upregulated DEGs.
Green borders indicate significant downregulated DEGs.
Nodes with no border are not differentially expressed.
Orange edges represent positive correlations.
Blue edges represent negative correlations.
Solid edges indicate no significant differences in correlation.
Dashed edges indicate significant differences in correlation with
strong correlation in ODG and no correlation in GBM.
Gray dashed edges indicate significant differences in correlation
with strong correlation in GBM and no correlation in ODG.

Figure 5 A gene-gene association network created in Cytoscape using output from DAPfinder. Nodes with yellow fill identify genes
involved in statistically significant DAPs. Nodes with gray fill identify genes that were not involved in any significant differences in gene-gene
correlation between ODG and GBM tumors, but consistently correlate with the "yellow" genes in both ODG and GBM. Nodes with red borders
indicate that a gene was upregulated in ODG, nodes with green borders represent genes that were downregulated in ODG and nodes with no
border did not have any significant differential expression. Orange edges represent positive correlations between two genes in the ODG tumors.
Blue edges denote negative correlations between genes in ODG tumors. Gray edges represent strong correlations in the GBM tumors, but no
correlation in ODG tumors. Solid edges represent correlations that are consistent in both ODG and GBM tumors, while dashed edges represent
statistically significant DAPs where correlations are not consistent among ODG and GBM groups.

Availability and requirements 

DAPfinder and DAPview may be downloaded for free
from the NIAID Exon website (http://exon.niaid.nih.gov/dapfinder/index.html).
Complete installation instructions
are provided on the website. DAPfinder and DAPview
require the installation of BRB-ArrayTools. BRB-ArrayTools
currently requires the installation of Microsoft
Excel, Java Virtual Machine, R 2.12.0 or higher and statconnDCOM
on a computer using the Microsoft Windows
operating system. DAPfinder and DAPview are BRB- 
ArrayTools plug-ins, which mostly utilize open source R 
script files. A complete description of the DAPfinder 
and DAPview files can be found in our supplementary 
materials (Additional file 1). DAPfinder and DAPview 
are also available to download as Additional Files 2 
and 3. 

Additional material 



Additional file 1: Additional information and Supplemental figures 
not included in the article. Additional Details About DAPfinder 
Methods; Development details of DAPfinder and DAPview; Validation of 
DAPfinder with Simulated Microarray Data; Discoveries from Glioma Data. 

Additional file 2: DAPfinder. DAPfinder plug-in software for BRB- 
ArrayTools. 

Additional file 3: DAPview. DAPview plug-in software for BRB- 
ArrayTools. 



List of abbreviations 

ANOVA: Analysis of Variance; ARACNe: Algorithm for the Reconstruction of 
Accurate Cellular Networks; AUC: Area Under Curve; CGH: Comparative 
Genomic Hybridization; CLR: Context Likelihood of Relatedness; CV: 
Coefficient of Variation; DAP: Differentially Associated Pair; DEG: Differentially 
Expressed Gene; FDR: False Discovery Rate; FWER: Family-Wise Error Rate; 
GBM: Glioblastoma multiforme; GMDI: Glioma Molecular Diagnostics 
Initiative; HTML: Hyper Text Markup Language; IQR: Interquartile Range; MR: 
Maximum Relevancy or Minimum Redundancy; ODG: Oligodendroglioma;
PCA: Principal Components Analysis; PDF: Portable Document Format;
REMBRANDT: Repository of Molecular Brain Neoplasia Data; ROC: Receiver 
Operator Characteristic. 

Acknowledgements 

This work was supported by funding from the Department of Intramural 
Research (DIR) and the Office of the Director (OD) at NIAID, NIH. Vivek 
Gopalan, Supriya Menezes and Ming-Chung Li helped with the initial coding 
of DAPfinder. Vijay Nagarajan and Michael Dolan helped produce and edit 
the figures. Natalia Shulzhenko discussed biological questions motivating our
tool. 

Author details 

1 Bioinformatics and Computational Biosciences Branch (BCBB), Office of
Cyber Infrastructure and Computational Biology (OCICB), National Institute of
Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH),
Bethesda, Maryland, USA. 2 Neuro-Oncology Branch (CHIAI), National Cancer
Institute (NCI), National Institute of Neurological Disorders and Stroke
(NINDS), National Institutes of Health (NIH), Bethesda, Maryland, USA. 3 Setor
de Imunogenetica, Departamento de Pediatria, Universidade Federal de Sao
Paulo (UNIFESP), Sao Paulo, Brasil. 4 Instituto de Matematica e Estatistica (IME),
Universidade de Sao Paulo (USP), Sao Paulo, Brasil. 5 Biometric Research
Branch (BRB), Division of Cancer Treatment and Diagnosis (DCTD), National
Cancer Institute (NCI), National Institutes of Health (NIH), Bethesda, Maryland,
USA. 6 "Ghost Lab", T-Cell Tolerance and Memory Section (TCTMS), Laboratory
of Cellular and Molecular Immunology (LCMI), National Institute of Allergy
and Infectious Diseases (NIAID), National Institutes of Health (NIH), Bethesda,
Maryland, USA.



Authors' contributions 

JS was the primary software developer, critically contributed to the overall 
concept of the software, performed and analyzed the simulation 
experiments, and drafted the manuscript. YK analyzed the glioma data and 
simulation results, drafted part of the manuscript related to the glioma data 
and provided input on the development of the software. SV assisted with 
software development, contributed the adaptive permutation test feature to 
the software, performed and analyzed simulation experiments. KLM tested 
the software and provided input on the development of the software. AY 
conceived the original idea for the software, provided input on the 
development of the software and assisted with the generation of simulated 
microarray data. RS provided formulas for approximate tests of Kendall and 
Spearman correlations, contributed the single class correlation selection 
procedure to select gene-gene pairs for permutation tests. YH supervised 
the software development project, provided web distribution of the 
software and provided input on the development of the software. AM 
conceived the original idea for the software, provided the original 
specifications for the software, analyzed and interpreted glioma data and 
simulation results, drafted parts of the manuscript and provided scientific 
leadership of the project. All authors reviewed and approved the 
manuscript. 

Authors' information 

None. 

Competing interests 

The authors declare that they have no competing interests. 

Received: 7 March 2011 Accepted: 14 July 2011 Published: 14 July 2011

References 

1. Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays 
applied to the ionizing radiation response. P Natl Acad Sci USA 2001,
98:5116-5121.

2. Efron B, Tibshirani R, Storey JD, Tusher V: Empirical Bayes analysis of a 
microarray experiment. J Am Stat Assoc 2001, 96:1151-1160.

3. Pan W: A comparative review of statistical methods for discovering 
differentially expressed genes in replicated microarray experiments. 
Bioinformatics 2002, 18:546-554. 

4. Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla 
Favera R, Califano A: ARACNE: an algorithm for the reconstruction of 
gene regulatory networks in a mammalian cellular context. BMC 
Bioinformatics 2006, 7(Suppl 1):S7. 

5. Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, 
Collins JJ, Gardner TS: Large-scale mapping and validation of Escherichia 
coli transcriptional regulation from a compendium of expression 
profiles. PLoS Biol 2007, 5:e8. 

6. Ding C, Peng H: Minimum redundancy feature selection from microarray 
gene expression data. J Bioinform Comput Biol 2005, 3:185-205. 

7. Peng H, Long F, Ding C: Feature selection based on mutual information: 
criteria of max-dependency, max-relevance, and min-redundancy. IEEE 
Trans Pattern Anal Mach Intell 2005, 27:1226-1238.

8. Kostka D, Spang R: Finding disease specific alterations in the co-expression
of genes. Bioinformatics 2004, 20(Suppl 1):i194-199.

9. Xiao YH, Frisina R, Gordon A, Klebanov L, Yakovlev A: Multivariate search 
for differentially expressed gene combinations. BMC Bioinformatics 2004, 
5(20). 

10. Choi JK, Yu US, Yoo OJ, Kim S: Differential coexpression analysis using 
microarray data and its application to human cancer. Bioinformatics 2005, 
21:4348-4355. 

11. Dettling M, Gabrielson E, Giovanni P: Searching for differentially expressed 
gene combinations. Genome Biol 2005, 6(164). 

12. Lai Y, Wu B, Chen L, Zhao H: A statistical method for identifying 
differential gene-gene co-expression patterns. Bioinformatics 2004, 
20:3146-3155. 

13. Watson M: CoXpress: differential co-expression in gene expression data.
BMC Bioinformatics 2006, 7(509).

14. Mani KM, Lefebvre C, Wang K, Lim WK, Basso K, Dalla-Favera R, Califano A: 
A systems biology approach to prediction of oncogenes and molecular 
perturbation targets in B-cell lymphomas. Mol Syst Biol 2008, 4:169. 



Skinner et al. BMC Bioinformatics 201 1, 12:286 
http://www.biomedcentral.eom/1 471 -2 1 05/1 2/286 



Page 8 of 8 



15. Simon R, Lam A, Li MC, Ngan M, Menenzes S, Zhao Y: Analysis of Gene 
Expression Data Using BRB-Array Tools. Cancer Inform 2007, 3:11-17. 

16. Benjamini Y, Hochberg Y: Controlling the False Discovery Rate - a 
Practical and Powerful Approach to Multiple Testing. J Roy Stat Soc B Met 
1995, 57(1):289-300. 

17. Benjamini Y, Yekutieli D: The control of the false discovery rate in 
multiple testing under dependency. Ann Stat 2001, 29:1165-1188. 

18. Storey JD: A direct approach to false discovery rates. J Roy Stat Soc B 
2002, 64:479-498. 

19. Storey JD: The positive false discovery rate: A Bayesian interpretation 
and the q-value. Ann Stat 2003, 31:2013-2035. 

20. Storey JD, Tibshirani R: Statistical significance for genomewide studies. 
P Natl Acad Sci USA 2003, 100:9440-9445. 

21. Wright SP: Adjusted P-Values for Simultaneous Inference. Biometrics 1992, 
48:1005-1013. 

22. Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, 
Christmas R, Avila-Campilo I, Creech M, Gross B, Hanspers K, Isserlin R, 
Kelley R, Killcoyne S, Lotia S, Maere S, Morris J, Ono K, Pavlovic V, Pico AR, 
Vailaya A, Wang PL, Adler A, Conklin BR, Hood L, Kuiper M, Sander C, 
Schmulevich I, Schwikowski B, Warner GJ, Ideker T, Bader GD: Integration of 
biological networks and gene expression data using Cytoscape. Nat 
Protoc 2007, 2:2366-2382. 

23. Madhavan S, Zenklusen JC, Kotliarov Y, Sahni H, Fine HA, Buetow K: 
Rembrandt: helping personalized medicine become a reality through 
integrative translational research. Mol Cancer Res 2009, 7:157-167. 

24. Behin A, Hoang-Xuan K, Carpentier AF, Delattre JY: Primary brain tumours 
in adults. Lancet 2003, 361:323-331. 

25. Sun LX, Hui AM, Su Q, Vortmeyer A, Kotliarov Y, Pastorino S, Passaniti A, 
Menon J, Walling J, Bailey R, Rosenblum M, Mikkelsen T, Fine HA: Neuronal 
and glioma-derived stem cell factor induces angiogenesis within the 
brain. Cancer Cell 2006, 9:287-300. 

26. Li A, Walling J, Ahn S, Kotliarov Y, Su Q, Quezado M, Oberholtzer JC, Park J, 
Zenklusen JC, Fine HA: Unsupervised analysis of transcriptomic profiles 
reveals six glioma subtypes. Cancer Res 2009, 69:2091-2099. 

27. Ducray F, Idbaih A, de Reynies A, Bieche I, Thillet J, Mokhtari K, Lair S, 
Marie Y, Paris S, Vidaud M, Hoang-Xuan K, Delattre O, Delattre JY, Sanson M: 
Anaplastic oligodendrogliomas with 1p19q codeletion have a proneural 
gene expression profile. Mol Cancer 2008, 7:41. 

28. Mukasa A, Ueki K, Ge X, Ishikawa S, Ide T, Fujimaki T, Nishikawa R, Asai A, 
Kirino T, Aburatani H: Selective expression of a subset of neuronal genes 
in oligodendroglioma with chromosome 1p loss. Brain Pathol 2004, 
14:34-42. 

29. Wong AJ, Bigner SH, Bigner DD, Kinzler KW, Hamilton SR, Vogelstein B: 
Increased Expression of the Epidermal Growth-Factor Receptor Gene in 
Malignant Gliomas Is Invariably Associated with Gene Amplification. P 
Natl Acad Sci USA 1987, 84:6899-6903. 

30. Vierbuchen T, Ostermeier A, Pang ZP, Kokubu Y, Sudhof TC, Wernig M: 
Direct conversion of fibroblasts to functional neurons by defined factors. 
Nature 2010, 463:1035-U1050. 

31. Aguirre A, Rubio ME, Gallo V: Notch and EGFR pathway interaction 
regulates neural stem cell number and self-renewal. Nature 2010, 
467:323-327. 

32. Huang PH, Xu AM, White FM: Oncogenic EGFR signaling networks in 
glioma. Sci Signal 2009, 2:re6. 

33. Libermann TA, Nusbaum HR, Razon N, Kris R, Lax I, Soreq H, Whittle N, 
Waterfield MD, Ullrich A, Schlessinger J: Amplification, enhanced 
expression and possible rearrangement of EGF receptor gene in primary 
human brain tumours of glial origin. Nature 1985, 313:144-147. 

34. Sainsbury JR, Farndon JR, Needham GK, Malcolm AJ, Harris AL: Epidermal- 
growth-factor receptor status as predictor of early recurrence of and 
death from breast cancer. Lancet 1987, 1:1398-1402. 

35. Veale D, Ashcroft T, Marsh C, Gibson GJ, Harris AL: Epidermal growth factor 
receptors in non-small cell lung cancer. Br J Cancer 1987, 55:513-516. 

36. Yano S, Kondo K, Yamaguchi M, Richmond G, Hutchison M, Wakeling A, 
Averbuch S, Wadsworth P: Distribution and function of EGFR in human 
tissue and the effect of EGFR tyrosine kinase inhibition. Anticancer Res 
2003, 23:3639-3650. 

37. Soikkeli J, Podlasz P, Yin M, Nummela P, Jahkola T, Virolainen S, Krogerus L, 
Heikkila P, von Smitten K, Saksela O, Holtta E: Metastatic outgrowth 
encompasses COL-I, FN1, and POSTN up-regulation and assembly to 
fibrillar networks regulating cell adhesion, migration, and growth. Am J 
Pathol 2010, 177:387-403. 



doi:10.1186/1471-2105-12-286 

Cite this article as: Skinner et al.: Construct and Compare Gene 
Coexpression Networks with DAPfinder and DAPview. BMC Bioinformatics 
2011 12:286. 






